Guidelines for Statistical Treatment of Less Than
Detection Limit Data in Dredged Sediment Evaluations

Purpose 

This technical note provides recommendations for methods of handling less
than detection limit data to permit statistical comparisons of sediment
contaminant or bioaccumulation samples in dredged sediment evaluations. Ten
censored data methods are evaluated; performance depends upon data
characteristics such as equality of variances, type of frequency distribution,
and the proportion of the data that is below the detection limit.

Background 

Regulatory evaluations of dredged sediments frequently require managers to
assess contaminant concentrations in the sediments themselves, or in the
tissues of organisms exposed to those sediments, as part of a tiered testing
protocol (U.S. Environmental Protection Agency/U.S. Army Corps of Engineers
(USEPA/USACE) 1991, 1994). A typical Tier III assessment, for example,
includes comparison of contaminant bioaccumulation in organisms exposed to
the dredged sediment(s) with bioaccumulation in organisms exposed to a
reference sediment. Statistical procedures for performing such comparisons are
described in detail in Appendix D of the Inland Testing Manual (USEPA/USACE
1994). However, most statistical protocols of the Inland Testing Manual
cannot be applied directly in the common situation where some contaminant
concentrations are reported only as less than some numerical detection
limit (DL). The actual concentrations of these "censored" data are
unknown and are presumed to fall between zero and the DL.

Previous studies (El-Shaarawi 1989; El-Shaarawi and Esterby 1992; Gaskin,
Dafoe, and Brooksbank 1990; Gilliom and Helsel 1986; Gleit 1985; Haas and
Scheff 1990; Helsel 1990; Helsel and Cohn 1988; Helsel and Gilliom 1986;
Kushner 1976; Newman and others 1989; Porter and Ward 1991) have examined
a variety of methods for handling data that include nondetects. Some of
these studies identified methods that perform well in parameter estimation
problems, for example, when a mean contaminant concentration must be
estimated to determine compliance with air or water quality standards. Censored
data methods recommended for estimation are based on maximum likelihood
and regression procedures. However, there is no consensus on which censored
data methods should be used when samples must be compared with
each other, as in the Tier III bioaccumulation assessments mentioned above,
and accurate parameter estimation is unnecessary. The most commonly used
methods are the simplest techniques, namely deletion of nondetects or
substitution of a constant such as zero, DL, or one-half DL (DL/2) for the unknown
observations. Interim guidance in the draft Inland Testing Manual
recommended substitution of DL/2 until statistically validated guidelines could be
developed.

To  address  the  need  for  censored  data  guidelines  for  sample  comparisons  in 
dredged  sediment  evaluations,  a  simulation  study  was  conducted  to  assess  the 
performance  of  10  censored  data  methods.  The  study  procedures  and  general 
results  have  been  described  elsewhere  (Clarke  1994,  1995).  The  10  censored 
data  methods  are  described  in  this  technical  note,  with  recommendations 
regarding  which  method  to  use  in  specific  situations. 

Additional  Information 


For additional information, contact the author, Ms. Joan U. Clarke, (601) 634-2954,
or the manager of the Environmental Effects of Dredging Programs,
Dr. Robert M. Engler, (601) 634-3624.

Note: The contents of this technical note are not to be used for advertising,
publication, or promotional purposes. Citation of trade names does not
constitute an official endorsement or approval of the use of such products.


Introduction 

Monte Carlo simulations were conducted to evaluate the performance of 10
censored data methods using the statistical procedures recommended in the
Inland Testing Manual (USEPA/USACE 1994, Appendix D). Specifically, this
entailed comparison of one or more dredged sediments with a reference
sediment using the Least Significant Difference (LSD) test on untransformed,
log-transformed, or rankit-transformed data (refer to the decision tree,
Figure D-5A,B of the Inland Testing Manual).

Simulations were conducted using equal and unequal variances with several
sample sizes, statistical population distributions, and numbers of sediments to
be simultaneously compared with a reference. Censoring was imposed at a
"detection limit" equivalent to 20, 40, 60, 80, or 95 percent of the reference
sediment population for each set of simulations; uncensored data were also
analyzed. Parameter specifications for the simulations are described in detail in
Clarke (1995). The entire focus of the study was on small sample size,
necessitated by the high cost of contaminant residue chemical analysis; equal
and unequal sample sizes ranging from three to eight replicates were used in
the simulations. A total of 335,000 simulations were performed. Simulation
results were verified using 271 comparisons of actual chemical concentration
data from sediment and tissue samples analyzed for several dredged material
contaminant evaluation projects (Clarke 1995).

In the simulations and verifications, censored data methods were evaluated
for power and for type I statistical error rate (α). Power is the probability
that the statistical test (in this study, the LSD test) detects a true
difference. Type I error rate is the probability that the statistical test falsely
declares significant a difference that does not exist in the populations from which
the samples were drawn. By convention, α is generally set to 0.05 in
biological testing; that is, a false positive error rate of 5 percent or less is considered
acceptable. Ideally, power should be about 95 percent, but this is frequently
impossible due to fiscal or logistical constraints on the number of samples that
can be collected or analyzed. Censored data methods should be chosen to
maximize power and, if possible, minimize α. Although all methods can be
expected to lose power as the amount of censoring increases, the best methods
should minimize loss of power and inflation of α with increased censoring.
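
The two quantities defined above can be estimated by simulation. The following is a minimal sketch, not the procedure used in the study: it assumes two groups of five normally distributed replicates (in which case the LSD test reduces to a pooled-variance t test), a one-sided test at the 5-percent level, and hypothetical population means and standard deviations chosen only for illustration.

```python
import random
import statistics

N = 5                # replicates per group (assumed for illustration)
T_CRIT_DF8 = 1.860   # one-sided 5% critical value of Student's t, 8 df (valid only for N = 5)

def t_statistic(dredged, reference):
    """Pooled-variance two-sample t statistic (dredged minus reference)."""
    n1, n2 = len(dredged), len(reference)
    pooled_var = ((n1 - 1) * statistics.variance(dredged)
                  + (n2 - 1) * statistics.variance(reference)) / (n1 + n2 - 2)
    se = (pooled_var * (1 / n1 + 1 / n2)) ** 0.5
    return (statistics.mean(dredged) - statistics.mean(reference)) / se

def rejection_rate(true_shift, reps=4000, seed=1):
    """Fraction of simulated comparisons declared significant."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        # Hypothetical populations: mean 10, standard deviation 2.
        reference = [rng.gauss(10.0, 2.0) for _ in range(N)]
        dredged = [rng.gauss(10.0 + true_shift, 2.0) for _ in range(N)]
        if t_statistic(dredged, reference) > T_CRIT_DF8:
            hits += 1
    return hits / reps

alpha_hat = rejection_rate(true_shift=0.0)  # no true difference: estimates alpha
power_hat = rejection_rate(true_shift=4.0)  # true difference of 2 SD: estimates power
```

Applying a censored data method to each simulated sample before computing the test statistic would show how that method changes both estimates.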

Censored  Data  Methods 

Ten censored data methods amenable to simulations using SAS® (SAS Institute,
Inc. 1988a,b,c) were chosen for evaluation:

•  DL.  Substitution  of  the  detection  limit  for  all  nondetects. 

•  DL/2.  Substitution  of  one-half  the  detection  limit  for  all  nondetects. 

•  ZERO.  Substitution  of  zero  for  all  nondetects. 

When data are subsequently transformed to rankits, the above three methods
produce exactly the same results (assuming all uncensored observations in
the sample are greater than DL), and are collectively called CONST, for
substitution of any constant between 0 and DL.

• UNIF. Nondetects are replaced by ordered observations $x_i$ ($i = 1, 2, \ldots, n_c$,
where $n_c$ is the number of censored observations in the sample) between 0
and DL, where

$x_i = \mathrm{DL}\,(i - 1)/(n_c - 1)$

and $x_i = \mathrm{DL}/2$ when $n_c = 1$. This produces a uniform distribution symmetric
around DL/2 (Gilliom and Helsel 1986).

•  UNIFR.  Replacement  of  nondetects  by  random  numbers  from  a  uniform 
distribution  between  0  and  DL.  This  may  be  done  using  a  random  numbers 
table  or  a  random  number  generator  such  as  the  RANUNI  function  in  SAS 
(SAS  Institute,  Inc.  1988c). 


• MLE NORM. Maximum likelihood estimation of below-DL values assuming
a normal distribution, using the SAS LIFEREG procedure (SAS Institute,
Inc. 1988a).

• MLE LOGN. Maximum likelihood estimation of below-DL values assuming
a lognormal distribution, using the SAS LIFEREG procedure (SAS Institute,
Inc. 1988a).

• MLE WEIB. Maximum likelihood estimation of below-DL values assuming
a Weibull distribution, using the SAS LIFEREG procedure (SAS Institute,
Inc. 1988a).

In the three MLE methods, the $i = 1, 2, \ldots, n_c$ censored observations are replaced
by the values corresponding to the first $n_c$ of $n$ evenly spaced percentiles of
the MLE-generated distribution.

•  NR.  Substitution  of  estimated  values  from  a  normal  distribution  using 
linear  regression  of  above-DL  concentrations  versus  their  rankits  (Gilliom 
and  Helsel  1986). 

•  LR.  Substitution  of  estimated  values  from  a  lognormal  distribution  using 
linear  regression  of  logarithms  of  above  DL  concentrations  versus  their 
rankits  (Gilliom  and  Helsel  1986,  Clarke  1992). 

The regression equation calculated in these methods is used to extrapolate
values for the censored observations. For LR, antilogs of the extrapolated
values are used.
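
The substitution methods listed above (DL, DL/2, ZERO, UNIF, and UNIFR) are simple enough to sketch in a few lines. The example below is an illustrative Python version, not the SAS statements referenced in this note; the function name `substitute` and the use of `None` to mark a nondetect are assumptions made for the example, and a single detection limit per sample is assumed.

```python
import random

def substitute(values, dl, method, rng=None):
    """Replace nondetects (None) in `values` using one substitution method.

    values : list of floats, with None marking each nondetect
    dl     : the single detection limit for the sample
    method : "DL", "DL/2", "ZERO", "UNIF", or "UNIFR"
    """
    n_c = sum(v is None for v in values)
    if method == "DL":
        fills = [dl] * n_c
    elif method == "DL/2":
        fills = [dl / 2] * n_c
    elif method == "ZERO":
        fills = [0.0] * n_c
    elif method == "UNIF":
        # Evenly spaced values from 0 to DL; a single nondetect gets DL/2.
        fills = [dl / 2] if n_c == 1 else [dl * i / (n_c - 1) for i in range(n_c)]
    elif method == "UNIFR":
        # Random draws from a uniform distribution between 0 and DL.
        rng = rng or random.Random()
        fills = [rng.uniform(0.0, dl) for _ in range(n_c)]
    else:
        raise ValueError(f"unknown method: {method}")
    fill_iter = iter(fills)
    return [next(fill_iter) if v is None else v for v in values]

sample = [None, 0.8, None, 1.4, None]   # hypothetical concentrations
unif_filled = substitute(sample, dl=0.5, method="UNIF")
half_filled = substitute(sample, dl=0.5, method="DL/2")
```

With three nondetects and DL = 0.5, UNIF fills in 0, 0.25, and 0.5, while DL/2 fills in 0.25 for each nondetect.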

SAS program statements for the methods described above are provided in
Appendix D of USEPA/USACE (1994) or can be obtained from the author.
Several other censored data methods are available but were considered
unsuitable for this study (Clarke 1995). In particular, deletion of censored data is not
recommended as it results in excessive loss of information and power as the
amount of censoring increases. Slymen, de Peyster, and Donohoe (1994)
describe and recommend tobit analysis using the SAS LIFEREG procedure for
comparing samples with values below DL in environmental studies. The
authors present statistical justification for this method, but it could not be
compared with the other methods described in this technical note due to the
limitations of SAS LIFEREG output for conducting large numbers of simulations.

Considerations  in  Selecting  the  Best  Censored  Data  Methods 

Simulation results clearly indicate that no single censored data method
works best in all situations. Before selecting a method for treatment of
nondetects in contaminant evaluations, the investigator should determine, if possible,
certain characteristics of the data. Are variances equal or unequal among the
samples being compared? If variances are equal, what is the coefficient of
variation (CV = standard deviation ÷ mean) of the combined samples? If
variances are unequal, do they increase as sample means increase, or do they
follow no particular pattern in relation to sample means (mixed variances)?
When the samples are combined, are the residuals normally distributed,
lognormally distributed (that is, do they pass the test of normality following
log transformation), or nonnormally distributed? The type of data distribution
and the variance characteristics appear to have the greatest influence upon the
censored data methods. For the limited ranges considered in this study,
sample size and number of treatments being compared seem to have less effect
upon the censored data methods.

To determine type of data distribution and variance characteristics for
censored samples, investigators can apply two or more of the censored data
methods described above to obtain a range of possible variances and CVs. The
revised data (both untransformed and log-transformed) can then be tested for
normality and equality of variances using procedures such as those described
in Appendix D of USEPA/USACE (1994).
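
As a rough illustration of bracketing the CV in this way, the sketch below computes the CV of one sample after substituting ZERO, DL/2, and DL for the nondetects. The function names and the data are hypothetical, and the sample standard deviation (n − 1 denominator) is assumed.

```python
import statistics

def cv(values):
    """Coefficient of variation: sample standard deviation divided by mean."""
    return statistics.stdev(values) / statistics.mean(values)

def cv_range(detects, n_censored, dl):
    """CV of a sample after substituting ZERO, DL/2, and DL for nondetects."""
    result = {}
    for name, fill in (("ZERO", 0.0), ("DL/2", dl / 2), ("DL", dl)):
        result[name] = cv(detects + [fill] * n_censored)
    return result

# Hypothetical sample: four detected values and two nondetects at DL = 0.5.
cvs = cv_range(detects=[0.8, 1.1, 1.4, 2.0], n_censored=2, dl=0.5)
```

For this sample the CV is largest under ZERO and smallest under DL, so the two extremes bound the plausible CV for the censored sample.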

When samples are severely censored, investigators may be able to make an
educated guess concerning distribution and variance characteristics based on
uncensored data for the same contaminant or on historical data from the same
location. Of the 271 comparisons performed using real chemical data in the
verification study, half had equal variances among the samples being
compared, while 30 percent had mixed variances and 20 percent had variances
proportional to the sample means. Sixty percent of the samples passed the
Shapiro-Wilk's test of normality (USEPA/USACE 1994), 25 percent passed
when data were log-transformed, and 15 percent failed. Nevertheless, in the
absence of information to the contrary, it may be reasonable to assume a
lognormal distribution for environmental trace chemical data (El-Shaarawi 1989;
Gilliom, Hirsch, and Gilroy 1984; Kushner 1976; Newman and others 1989; Ott
and Mage 1976; Porter and Ward 1991; Travis and Land 1990). A normal
distribution is unlikely for contaminant concentration data when the CV exceeds
1, as such a distribution would include a fair amount of negative
concentrations. For example, a normal distribution contains ≈16 percent negative values
when the CV = 1 and ≈31 percent negative values when the CV = 2.
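
The share of negative values implied by a normal distribution follows directly from the CV: for mean μ > 0 and standard deviation CV × μ, the probability of a value below zero is Φ(−1/CV), where Φ is the standard normal cumulative distribution function. A short check, using only the Python standard library (the function name is chosen for this example):

```python
from statistics import NormalDist

def negative_fraction(cv):
    """Fraction of a normal distribution below zero for a given CV.

    With mean mu > 0 and standard deviation cv * mu,
    P(X < 0) = Phi(-mu / (cv * mu)) = Phi(-1 / cv).
    """
    return NormalDist().cdf(-1.0 / cv)

frac_cv1 = negative_fraction(1.0)  # about 0.159
frac_cv2 = negative_fraction(2.0)  # about 0.309
```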

The next consideration should be the relative importance of power versus
type I error rate (α) in the statistical comparisons. The censored data methods
were compared based on power adjusted for α (that is, mean power minus
mean α). The most powerful methods generally had α in the range of 0.05 to
0.10 for amounts of censoring up through 80 percent, but much higher α at
95-percent censoring. If it is crucial to maintain α at approximately 0.05 or
less, it may be necessary to select somewhat less powerful methods in certain
cases. In a number of situations, there are no suitably powerful methods with
α < 0.05.

When several methods had adjusted mean power within 0.05 of the
uncensored data, priority was given to the simplest method(s). In order of
increasing complexity, the censored data methods were constant substitution (DL,
DL/2, ZERO), substitution from a uniform distribution (UNIF and UNIFR),
regression techniques (NR and LR), and maximum likelihood techniques
(MLE LOGN, MLE NORM, and MLE WEIB). In most situations, the simplest
methods were also the most powerful.

Table 1. Recommended Censored Data Methods for Small Samples to Be Used in Statistical Comparisons

Columns: Amount of Censoring, % | Variances | Distribution | Coefficient of Variation | Data Transformation (a): Log | None (b) | Rankit

≤20 | Equal | All | ≤0.25 | DL | DL | CONST, UNIF
≤20 | Equal | Normal | 0.26 - 1 | DL | DL/2, UNIF, DL | CONST, UNIF
≤20 | Equal | Lognormal, Nonnormal | 0.26 - 1 | [illegible] | | CONST, UNIF
≤20 | Equal | Lognormal, Nonnormal | >1 | DL/2, DL, UNIF | [illegible] | CONST, UNIF
≤20 | Increase as Means Increase | Normal | | ZERO (c) | LR | (d)
≤20 | Increase as Means Increase | Lognormal, Nonnormal | | DL | | CONST, UNIF
≤20 | Mixed | Normal | | (d) | DL | CONST, MLE NORM
≤20 | Mixed | Lognormal, Nonnormal | | DL (c) | | MLE WEIB (c)
21-40 | Equal | All | ≤0.25 | DL | DL | CONST, UNIF
21-40 | Equal | Normal | 0.26 - 0.5 | DL | DL/2, DL | CONST, UNIF
21-40 | Equal | Lognormal, Nonnormal | 0.26 - 1 | DL/2 | | CONST, UNIF
21-40 | Equal | Lognormal, Nonnormal | >1 | DL/2, DL | | CONST, UNIF
21-40 | Increase as Means Increase | Normal | | [illegible] | DL | (d)
21-40 | Increase as Means Increase | Lognormal, Nonnormal | | DL, DL/2 | | CONST, UNIF
21-40 | Mixed | Normal | | DL/2 (c) | ZERO, DL/2 (c) | CONST, MLE WEIB
21-40 | Mixed | Lognormal, Nonnormal | | DL | | CONST

(Continued)

(a) Method(s) in bold indicate most powerful transformation(s). Methods in italics have mean α
between 0.06 and 0.10; nonitalicized methods have mean α < 0.06.
(b) Untransformed data generally should not be used with lognormal or nonnormal distributions.
(c) All methods with acceptable power have α ≥ 0.06.
(d) All methods have unacceptably low power and/or high α.

Table 1. (Concluded)


Amount 

of 

Censoring, 

% 


41  -  60 


61  -  80 


Variances 

Distribution 

Coefficient 

of 

Variation 

Equal 

All 

≤0.25

Normal 

0.26  -  1 

Lognormal, 

Nonnormal 

>0.25 

Increase 

Normal 

as  Means 

Lognormal 

Increase 

Nonnormal 

Mixed 

Normal 

Lognormal 

Nonnormal 

Equal 

All 

<0.25 

Normal 

0.26  -  1 

Lognormal, 

0.26  -  0.5 

Nonnormal 

0.51  -  1 

Lognormal 

>1 

Nonnormal 

>1 

Increase 
as  Means 

Normal 

Increase 

Lognormal 

Nonnormal 

Mixed 

Normal 

Lognormal,  Nonnormal 

Data Transformation


Log 


DL/2 


DL/2 


DL/2 


DL 


DL/2 


DL,  DL/2 


DL/2 


DL/2 


DL/2C 


DL/2 


DL/2 


DL/2 (c)


DL/2 

UNIF 


None 


DL/2 


DL/2, 

ZERO 


DL/2, 

ZERO 


Rankit 


CONST 


DL/2 


DL/2  CONST 


CONST 


CONST 


CONST 


CONST 


CONST* 


CONST * 


CONST 


CONST 


CONST 


CONST 


CONST 


CONST * 


CONST 


CONST 


CONST 


CONST c 


Recommendations  for  Censored  Data  Methods 

Censored data methods recommended for various situations of equal and
unequal variances, statistical frequency distributions, CVs, data
transformations, and amounts of censoring are given in Table 1. When two or three
methods are essentially equivalent in power, type I error rate, and simplicity,
all are listed in the table in order of decreasing power. Method(s) highlighted
in bold indicate the data transformation(s) having the highest adjusted power
in a given situation. Methods in italics have mean α between 0.06 and 0.10;
nonitalicized methods have mean α < 0.06. When the recommended method
has mean α > 0.06, if possible, an alternative (although usually less powerful)
method having lower α is given in the table. Situations in which all methods
have unacceptably low power and/or high α are also indicated in the table.
Methods having adjusted mean power within 0.05 of the most powerful
method for a given censoring percentile and variance-distribution-CV
combination and at least half the power of the uncensored data for that combination
were considered to have acceptable power.
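
The acceptability criterion in the preceding paragraph can be written as a small filter. This is a sketch only: the function name and the power values are hypothetical, and it assumes that the "half the power of the uncensored data" comparison is made on the same adjusted-power scale (power minus α) as the other comparison, which the text does not state explicitly.

```python
def acceptable_methods(adjusted_power, uncensored_power, margin=0.05):
    """Return methods meeting both acceptability criteria for one situation.

    adjusted_power   : dict of method name -> adjusted mean power (power minus alpha)
    uncensored_power : adjusted power of the uncensored data for the same situation
    """
    best = max(adjusted_power.values())
    return sorted(
        name for name, power in adjusted_power.items()
        if power >= best - margin and power >= 0.5 * uncensored_power
    )

# Hypothetical adjusted powers, invented for illustration only.
example = {"DL": 0.62, "DL/2": 0.58, "ZERO": 0.41, "UNIF": 0.55}
chosen = acceptable_methods(example, uncensored_power=0.70)
```

Here DL and DL/2 qualify; ZERO and UNIF fall more than 0.05 below the most powerful method.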

In most situations shown in Table 1, a single powerful method can be
applied regardless of which data transformation, if any, might be needed. For
example, when censoring is ≤20 percent, variances are equal, and CV is <0.25,
DL should be substituted for all nondetects. The tests of assumptions in
Appendix D of USEPA/USACE (1994) would then determine whether
untransformed, log-transformed, or rankit-transformed data should be used in the
statistical comparisons. Alternatively, UNIF could be used with rankits. These
methods have approximately equal power. However, if censoring is between
40 and 60 percent, variances are equal, and CV is ≤0.25, CONST with rankits
should be preferred, as the power of this combination exceeds that of any
method with untransformed or log-transformed data. In cases when power is
exceptionally low, especially when variances are unequal, a different method
for each transformation may be required to maximize power.

Following  is  a  discussion  of  the  individual  censored  data  methods  and  the 
situations  in  which  they  should  or  should  not  be  used. 

DL is generally the preferred method at low to moderate proportions of
censoring, especially when the CV is low, or when variances are unequal and
data are not normally distributed. In particular, DL performs better than all
other methods and much better than the other simple substitution methods at
≤40 percent censoring when the CV is extremely low (≤0.25). In most cases
DL should not be used with data that are highly censored (>60 percent
censoring). DL has low power at <40 percent censoring with log transformation
when data are normally distributed and variances increase with increasing
means.

DL/2 generally begins to surpass DL in power as CV and censoring increase.
DL/2 tends to have slightly higher α than DL when variances are equal.
DL/2 should not be used when the CV is extremely low (≤0.25) and less than
40 percent of the data are censored. DL/2 also has low power and/or high α
at <60 percent censoring when data are normally distributed and variances are
unequal.

ZERO is recommended for use with untransformed, normally distributed
data in a few situations. In general, ZERO should not be used with
log-transformed data as this amounts to deletion of the censored data, resulting in
low power and high α. One exception, in which ZERO proved to be the most
powerful method with log-transformed data, was normal distribution at
<40 percent censoring when variances increase as means increase. However, α
in this case exceeds 0.05.

CONST is almost universally appropriate for rankit-transformed data, and is
usually the most powerful method with rankits. In several situations CONST
with rankits is equally or more powerful than the best-performing method
with untransformed or log-transformed data. However, when data are
normally distributed, variances increase with means, and censoring is <40 percent,
all methods with rankits have unacceptably low power compared with
log-transformed and untransformed data. Type I error rate is high for CONST
with rankits when variances are mixed and data are normally distributed; in
almost all other cases, α does not exceed 0.06.
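
The reason a single CONST method suffices for rankits is that any constant between 0 and DL yields the same ranks, provided every detected value exceeds DL. The sketch below illustrates this with hypothetical data. It assumes Blom's approximation for the normal scores and average ranks for ties; both choices are assumptions for the example and are not specified in this note.

```python
from statistics import NormalDist

def rankits(values):
    """Normal scores of the ranks, with tied values sharing their average rank.

    Uses Blom's approximation, Phi^-1((r - 3/8) / (n + 1/4)); this choice of
    formula is an assumption, not taken from the technical note.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    start = 0
    while start < n:
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < n and values[order[end + 1]] == values[order[start]]:
            end += 1
        average_rank = (start + end) / 2 + 1  # ranks are 1-based
        for k in range(start, end + 1):
            ranks[order[k]] = average_rank
        start = end + 1
    return [NormalDist().inv_cdf((r - 0.375) / (n + 0.25)) for r in ranks]

detects = [0.8, 1.4, 0.9]   # hypothetical detected values, all above DL
dl = 0.5
scores = {
    name: rankits([fill, fill] + detects)   # two nondetects replaced by `fill`
    for name, fill in (("ZERO", 0.0), ("DL/2", dl / 2), ("DL", dl))
}
```

All three entries of `scores` are identical, because the two substituted values always tie for the lowest ranks regardless of which constant is used.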

UNIF is the most powerful method with log-transformed data at high
amounts of censoring when data are nonnormal and variances increase as
means increase. When used with rankits, UNIF is essentially equal in power
to CONST in most situations. Type I error rate tends to be extremely low for
UNIF, especially as censoring increases. Therefore, UNIF can be a suitable
alternative to the most powerful method in some situations when low α is
desired.

UNIFR is generally slightly less powerful, with slightly higher α, than UNIF.
Power is low for most situations at 60 percent censoring or more. UNIFR is
not the recommended method in any situation.

MLE NORM is recommended in two situations as an alternative to the most
powerful method when low α is desired: with rankits at ≤20 percent censoring
when variances are mixed and data are normal, and with log-transformed data
at 21 to 40 percent censoring when variances increase with means and data are
normal. MLE NORM has low power at 60 percent censoring or more, and
also in many cases at 40 percent and even 20 percent censoring. MLE NORM
should not be used with log-transformed data when the CV is high as this
method may substitute negative concentrations for the nondetects.

MLE LOGN is not the most powerful method in any situation. Power is
low when censoring exceeds 40 percent, and α tends to be high for
log-transformed data in many cases at low amounts of censoring.


MLE WEIB is recommended for rankits as an alternative to CONST at 21 to
40 percent censoring when variances are mixed, data are normally distributed,
and low α is required. MLE WEIB should also be used with rankits at <20 percent
censoring when variances are mixed and data are not normally
distributed. In most other cases MLE WEIB has less power than MLE LOGN, and is
inappropriate for log-transformed data, or for any data when censoring
exceeds 40 percent.

LR and NR appear to be inappropriate as censored data methods for
statistical comparisons of small samples in most circumstances. Power is generally
low even at 20 percent censoring, and declines precipitously as censoring
increases. Conversely, α is generally high even at 20 percent censoring and
increases dramatically as censoring increases, sometimes approaching 1. LR is
recommended only for untransformed data at 20 percent censoring when
variances are proportional to means and data are normally distributed.

The simple substitution (DL, DL/2, ZERO, CONST) and uniform distribution
(UNIF, UNIFR) methods can be applied regardless of the amount of censoring.
The MLE methods cannot be used when all observations in a sample are
below DL. The regression methods (LR, NR) require at least three uncensored
observations in each sample, and thus are inapplicable for small sample sizes
when censoring exceeds about 20 percent.
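
These applicability limits amount to a simple check on the number of uncensored observations in a sample. The sketch below encodes only the minimum requirements stated in the paragraph above; the function name and family labels are chosen for the example.

```python
def applicable_methods(n, n_censored):
    """List the method families usable for one sample of size n."""
    n_detects = n - n_censored
    # Substitution and uniform methods work at any amount of censoring.
    families = ["simple substitution", "uniform distribution"]
    if n_detects >= 1:
        # MLE cannot be used when every observation is below DL.
        families.append("maximum likelihood")
    if n_detects >= 3:
        # Regression needs at least three uncensored observations.
        families.append("regression")
    return families

five_two = applicable_methods(n=5, n_censored=2)
five_three = applicable_methods(n=5, n_censored=3)
all_censored = applicable_methods(n=4, n_censored=4)
```

A method family being usable in this sense does not mean it is recommended; Table 1 and the discussion above govern that choice.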


Verifications 

Verification results overwhelmingly support the simulation study
conclusions that simple substitution or uniform distribution methods work best in
most situations to prepare censored samples for statistical comparisons. In no
case did the maximum likelihood or regression techniques have sufficient
power in the verifications to be considered useful. Verification results favor
the use of DL at 20 to 40 percent censoring when the CV is low (<0.25), and
DL/2 otherwise. Although generally less powerful than DL/2, UNIF and
UNIFR have low α and perform well at 20 to 40 percent censoring when log
transformation is not used. ZERO also performs well, especially at 40 to 60 percent
censoring, but should not be used with log transformation. No methods
have sufficient power to be useful at 80 percent censoring except DL/2 when
the CV is high (>0.75).

Summary 

Simulation and verification results indicate that, in most cases, the
sophisticated statistical techniques recommended for estimation problems involving
censored data are unnecessary or even inappropriate for statistical comparisons
of small, censored data samples. In general, the simple substitution methods
work best to maintain power and control type I error rate in statistical
comparisons. Recommended steps in selecting the best censored data method for a
given situation are listed below.


For  each  contaminant  for  which  some  data  are  reported  as  nondetect  or  <DL: 

•  Determine  proportion  of  data  that  are  censored  (all  samples  combined). 

• Determine whether variances are equal or unequal among samples. If
unequal, do the variances increase as means increase, or are the variances
seemingly random (mixed)?

•  Determine  CV  of  combined  samples. 

• Determine whether combined sample residuals are distributed normally,
lognormally, or nonnormally. If CV > 1, assume lognormal or nonnormal
distribution.

•  Refer  to  Table  1  to  determine  most  appropriate  method  given  the  amount  of 
data  censoring,  properties  of  variances,  and  type  of  statistical  distribution. 
Where  possible,  preference  should  be  given  to  methods  in  bold. 

• If it is crucial to maintain α at approximately 0.05 or less, choose
nonitalicized methods where available in Table 1.

• Apply selected method to censored data, then continue with tests of
assumptions and statistical comparison procedures as outlined in USEPA/USACE
(1994). Avoid a data transformation for which no method is given in Table 1
due to low power or excessively high α.

• Do not attempt statistical comparisons of severely censored samples in
situations where no censored data methods are considered appropriate. In such
cases, the probability of an erroneous outcome is high.

If it is impossible to determine characteristics of the variances or statistical
distribution for censored data samples, use DL for up to 40 percent censoring
or DL/2 for 40 to 80 percent censoring. An alternative, although somewhat
less powerful in many situations, is to substitute any constant between 0 and
DL, convert the data to rankits, and then follow the nonparametric decision
procedures in Figure D-5B of USEPA/USACE (1994). Power loss using
CONST with rankits, when compared with DL or DL/2 on untransformed or
log-transformed data, is generally around 5 to 10 percent when variances are
equal and data are lognormally or nonnormally distributed, <4 percent when
variances are equal and data are normally distributed, up to 14 percent when
variances are proportional to the means, and up to 6 percent when variances
are mixed. No matter what technique is used, power will generally decline as
censoring increases. Beyond 60 to 80 percent censoring, it is unlikely that any
technique will perform acceptably.
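
The default rule in the preceding paragraph, for cases where variance and distribution characteristics cannot be determined, can be sketched as follows. The function name is hypothetical. The text assigns 40 percent censoring to both ranges; this sketch assigns it to DL, and it returns no method above 80 percent censoring. Both choices are assumptions for the example.

```python
def fallback_method(percent_censored):
    """Default substitution when variances and distribution are unknown."""
    if not 0 <= percent_censored <= 100:
        raise ValueError("percent_censored must be between 0 and 100")
    if percent_censored <= 40:
        return "DL"
    if percent_censored <= 80:
        return "DL/2"
    return None  # no technique is expected to perform acceptably

light = fallback_method(25)     # "DL"
heavy = fallback_method(60)     # "DL/2"
severe = fallback_method(95)    # None
```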

It is quite possible that an evaluation including a number of sediments and
contaminants would produce comparisons involving several different
combinations of censoring proportions, variance characteristics, and data frequency
distributions. Following the guidelines herein would result in the application of
more than one censored data method to the project data. This is entirely
acceptable when the censored data methods are selected for the purpose of
maximizing power and minimizing type I error. What is not acceptable is to try
several censored data methods for the purpose of finding one that will produce a
desired statistical comparison outcome.


The  simulation  study  did  not  address  the  performance  of  censored  data 
methods  in  the  common  situation  of  multiple  detection  limits  within  a  set  of 
replicate  observations.  However,  the  simple  substitution  methods  shown  to 
work  well  in  nearly  all  cases  with  single-detection  limit  censored  samples  can 
be  applied  without  modification  to  multiple-detection  limit  censored  samples. 